Unsupervised learning of link specifications: deterministic vs. non-deterministic
Abstract
Link Discovery has been shown to be of central importance for the Linked Data Web. In previous work, several supervised approaches have been developed for learning link specifications from labelled data. Most recently, genetic programming has also been used to learn link specifications in an unsupervised fashion by optimizing a parametrized pseudo-F-measure. The questions underlying this evaluation paper are twofold. First, how well do pseudo-F-measures predict the real accuracy of non-deterministic and deterministic approaches across different types of data sets? Second, how do deterministic approaches compare to non-deterministic approaches? To answer these questions, we evaluated linear and Boolean classifiers against classifiers computed using genetic programming on six different data sets. We also studied the correlation between two different pseudo-F-measures and the real F-measures achieved by the classifiers at hand. Our evaluation suggests that pseudo-F-measures behave differently on the synthetic and real data sets.
Similar resources
Inferring Hierarchical Clustering Structures by Deterministic Annealing
The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree...
Bayesian Melding of Deterministic Models and Kriging for Analysis of Spatially Dependent Data
The invention and development of spatial data melding methods stem from the link between geographic information systems and decision-making approaches. These methods combine different data sets to achieve better results. In this paper, the Bayesian melding method for combining measurements with the outputs of deterministic models and kriging is considered. Then the ozone data in Tehran city are analyze...
The State of Deterministic Thinking among Mothers of Autistic Children
Objectives: The purpose of the present study was to investigate the effectiveness of cognitive-behavior education in decreasing deterministic thinking in mothers of children with autism spectrum disorders. Methods: Participants were 24 mothers of autistic children who were referred to counseling centers of Tehran and whose children’s disorder had been diagnosed by at least a psychiatrist and...
Analysis of a Probabilistic Record Linkage Technique without Human Review
We previously developed a deterministic record linkage algorithm demonstrating sensitivities approaching 90% while maintaining 100% specificity. Substantially better performance has been reported using probabilistic linkage techniques; however, such methods often incorporate human review into the process. To avoid human review, we employed an estimator function using the Expectation Maximizatio...